An introduction to speech-based technologies for Natural Language Processing applications

Published in Mexican NLP Summer School 2021, 2021

Over the last two decades, the amount of data generated and collected has grown exponentially, driven especially by the rise of unstructured data such as images, videos, and text. More recently, audio and speech data have attracted considerable interest, for example through voice assistants. Companies like Google, Facebook, Apple, and Amazon have shown increasing interest in professionals with the skills and tools for ‘understanding’ and ‘transforming’ the massive flow of speech data into relevant information. Some of the most important speech-based technologies are voice activity detection, speaker diarization and identification, and automatic speech recognition. The output of these technologies is often used as input to various NLP applications. This brief workshop will give you a set of basic tools for grasping the main aspects of speech-based technologies and how they can be implemented in real-life cases.

Find out more in our dedicated GitHub repositories:

  • Slides: Slides presented at the event: link
  • Tutorial: This repository contains the code for the speech components (ASR and speaker verification) of the tutorial workshop given at the Mexican NLP Summer School. GitHub Tutorial repo
  • Demo: An easy-to-use app that combines automatic speech recognition and speaker verification (using SpeechBrain) with a natural language processing task. GitHub Demo repo